NSF PAR Search | NSF Public Access Repository

Adaptive Sharding in Untrusted Environments

https://doi.org/10.1145/3769756

Mehta, Bhavana; Baghel, Nupur; Amiri, Mohammad Javad; Loo, Boon Thau; Marcus, Ryan (December 2025, Proceedings of the ACM on Management of Data)

Distributed data management systems employ data sharding techniques to achieve scalability. Traditional sharding approaches typically operate under the assumption of a trusted environment, where nodes may crash, but do not act adversarially. In untrustworthy environments, however, this assumption is no longer valid. This paper presents Marlin, an adaptive scalable data management system specifically designed for untrustworthy environments. Marlin leverages data sharding to enhance scalability while dynamically redistributing data across clusters to adapt to dynamic workloads. We propose two architectures: a centralized architecture serving as a baseline, which employs hypergraph partitioning within a trusted administrative domain, and a decentralized architecture that eliminates the need for such a trusted domain by managing shards across nodes in a decentralized manner. Both architectures utilize real-time monitoring and adaptive algorithms to dynamically adjust sharding in response to workload characteristics and adversarial conditions. Experimental results show that Marlin maintains consistent performance under diverse dynamic scenarios in untrustworthy environments by continuously optimizing shard distributions.
Free, publicly-accessible full text available December 4, 2026
Learned Offline Query Planning via Bayesian Optimization

https://doi.org/10.1145/3725316

Tao, Jeffrey; Maus, Natalie; Jones, Haydn; Zeng, Yimeng; Gardner, Jacob R; Marcus, Ryan (June 2025, Proceedings of the ACM on Management of Data)

Analytics database workloads often contain queries that are executed repeatedly. Existing optimization techniques generally prioritize keeping optimization cost low, normally well below the time it takes to execute a single instance of a query. If a given query is going to be executed thousands of times, could it be worth investing significantly more optimization time? In contrast to traditional online query optimizers, we propose an offline query optimizer that searches a wide variety of plans and incorporates query execution as a primitive. Our offline query optimizer combines variational auto-encoders with Bayesian optimization to find optimized plans for a given query. We compare our technique to the optimal plans possible with PostgreSQL and recent RL-based systems over several datasets, and show that our technique finds faster query plans.
Free, publicly-accessible full text available June 17, 2026
BFTBrain: Adaptive BFT Consensus with Reinforcement Learning

Wu, Chenyuan; Qin, Haoyun; Amiri, Mohammad Javad; Loo, Boon Thau; Malkhi, Dahlia; Marcus, Ryan (April 2025, USENIX)

Free, publicly-accessible full text available April 28, 2026
BFTGym: An Interactive Playground for BFT Protocols

https://doi.org/10.14778/3685800.3685850

Qin, Haoyun; Wu, Chenyuan; Amiri, Mohammad Javad; Marcus, Ryan; Loo, Boon Thau (August 2024, Proceedings of the VLDB Endowment)

Byzantine Fault Tolerant (BFT) protocols serve as a fundamental yet intricate component of distributed data management systems in untrustworthy environments. BFT protocols exhibit different design principles and performance characteristics under varying workloads and fault scenarios. The proliferation of BFT protocols and their growing complexity have made it increasingly challenging to analyze the performance and possible application scenarios of each protocol. This demonstration showcases BFTGym, an interactive platform that allows audience members to (1) evaluate, compare, and gather insights into the performance of various BFT protocols under a wide range of conditions, and (2) prototype new BFT protocols rapidly.
Full Text Available
Towards Full Stack Adaptivity in Permissioned Blockchains

https://doi.org/10.14778/3641204.3641216

Wu, Chenyuan; Amiri, Mohammad Javad; Qin, Haoyun; Mehta, Bhavana; Marcus, Ryan; Loo, Boon Thau (January 2024, Proceedings of the VLDB Endowment)

This paper articulates our vision for a learning-based untrustworthy distributed database. We focus on permissioned blockchain systems as an emerging instance of untrustworthy distributed databases and argue that as novel smart contracts, modern hardware, and new cloud platforms arise, future-proof permissioned blockchain systems need to be designed with full-stack adaptivity in mind. At the application level, a future-proof system must adaptively learn the best-performing transaction processing paradigm and quickly adapt to new hardware and unanticipated workload changes on the fly. Likewise, the Byzantine consensus layer must dynamically adjust itself to the workloads, faulty conditions, and network configuration while maintaining compatibility with the transaction processing paradigm. At the infrastructure level, cloud providers must enable cross-layer adaptation, which identifies performance bottlenecks and possible attacks, and determines at runtime the degree of resource disaggregation that best meets application requirements. Within this vision of the future, our paper outlines several research challenges together with some preliminary approaches.
Full Text Available
AdaChain: A Learned Adaptive Blockchain

https://doi.org/10.14778/3594512.3594531

Wu, Chenyuan; Mehta, Bhavana; Amiri, Mohammad Javad; Marcus, Ryan; Loo, Boon Thau (April 2023, Proceedings of the VLDB Endowment)

This paper presents AdaChain, a learning-based blockchain framework that adaptively chooses the best permissioned blockchain architecture to optimize effective throughput for dynamic transaction workloads. AdaChain addresses the challenge in Blockchain-as-a-Service (BaaS) environments, where a large variety of possible smart contracts are deployed with different workload characteristics. AdaChain supports automatically adapting to an underlying, dynamically changing workload through the use of reinforcement learning. When a promising architecture is identified, AdaChain switches from the current architecture to the promising one at runtime in a secure and correct manner. Experimentally, we show that AdaChain can converge quickly to optimal architectures under changing workloads and significantly outperform fixed architectures in terms of the number of successfully committed transactions, all while incurring low additional overhead.
Full Text Available
Robust Query Driven Cardinality Estimation under Changing Workloads

https://doi.org/10.14778/3583140.3583164

Negi, Parimarjan; Wu, Ziniu; Kipf, Andreas; Tatbul, Nesime; Marcus, Ryan; Madden, Sam; Kraska, Tim; Alizadeh, Mohammad (February 2023, Proceedings of the VLDB Endowment)

Query driven cardinality estimation models learn from a historical log of queries. They are lightweight, having low storage requirements, fast inference and training, and are easily adaptable for any kind of query. Unfortunately, such models can suffer unpredictably bad performance under workload drift, i.e., if the query pattern or data changes. This makes them unreliable and hard to deploy. We analyze the reasons why models become unpredictable due to workload drift, and introduce modifications to the query representation and neural network training techniques to make query-driven models robust to the effects of workload drift. First, we emulate workload drift in queries involving some unseen tables or columns by randomly masking out some table or column features during training. This forces the model to make predictions with missing query information, relying more on robust features based on up-to-date DBMS statistics that are useful even when query or data drift happens. Second, we introduce join bitmaps, which extend sampling-based features to be consistent across joins using ideas from sideways information passing. Finally, we show how both of these ideas can be adapted to handle data updates. We show significantly greater generalization than past works across different workloads and databases. For instance, a model trained with our techniques on a simple workload (JOBLight-train), with 40k synthetically generated queries of at most 3 tables each, is able to generalize to the much more complex Join Order Benchmark, which includes queries with up to 16 tables, and improve query runtimes by 2× over PostgreSQL. We show similar robustness results with data updates, and across other workloads. We discuss the situations where we expect, and see, improvements, as well as more challenging workload drift scenarios where these techniques do not improve much over PostgreSQL. However, even in the most challenging scenarios, our models never perform worse than PostgreSQL, while standard query driven models can get much worse than PostgreSQL.
Full Text Available
SageDB: An Instance-Optimized Data Analytics System

https://doi.org/10.14778/3565838.3565857

Ding, Jialin; Marcus, Ryan; Kipf, Andreas; Nathan, Vikram; Nrusimha, Aniruddha; Vaidya, Kapil; van Renen, Alexander; Kraska, Tim (September 2022, Proceedings of the VLDB Endowment)

Modern data systems are typically both complex and general-purpose. They are complex because of the numerous internal knobs and parameters that users need to manually tune in order to achieve good performance; they are general-purpose because they are designed to handle diverse use cases, and therefore often do not achieve the best possible performance for any specific use case. A recent trend aims to tackle these pitfalls: instance-optimized systems are designed to automatically self-adjust in order to achieve the best performance for a specific use case, i.e., a dataset and query workload. Thus far, the research community has focused on creating instance-optimized database components, such as learned indexes and learned cardinality estimators, which are evaluated in isolation. However, to the best of our knowledge, there is no complete data system built with instance-optimization as a foundational design principle. In this paper, we present a progress report on SageDB, our effort towards building the first instance-optimized data system. SageDB synthesizes various instance-optimization techniques to automatically specialize for a given use case, while simultaneously exposing a simple user interface that places minimal technical burden on the user. Our prototype outperforms a commercial cloud-based analytics system by up to 3X on end-to-end query workloads and up to 250X on individual queries. SageDB is an ongoing research effort, and we highlight our lessons learned and key directions for future work.
Full Text Available
Bao: Making Learned Query Optimization Practical

https://doi.org/10.1145/3448016.3452838

Marcus, Ryan; Negi, Parimarjan; Mao, Hongzi; Tatbul, Nesime; Alizadeh, Mohammad; Kraska, Tim (June 2021, SIGMOD '21: Proceedings of the 2021 International Conference on Management of Data)

Full Text Available
Flow-loss: learning cardinality estimates that matter

https://doi.org/10.14778/3476249.3476259

Negi, Parimarjan; Marcus, Ryan; Kipf, Andreas; Mao, Hongzi; Tatbul, Nesime; Kraska, Tim; Alizadeh, Mohammad (July 2021, Proceedings of the VLDB Endowment)

Recently there has been significant interest in using machine learning to improve the accuracy of cardinality estimation. This work has focused on improving average estimation error, but not all estimates matter equally for downstream tasks like query optimization. Since learned models inevitably make mistakes, the goal should be to improve the estimates that make the biggest difference to an optimizer. We introduce a new loss function, Flow-Loss, for learning cardinality estimation models. Flow-Loss approximates the optimizer's cost model and search algorithm with analytical functions, which it uses to optimize explicitly for better query plans. At the heart of Flow-Loss is a reduction of query optimization to a flow routing problem on a certain "plan graph", in which different paths correspond to different query plans. To evaluate our approach, we introduce the Cardinality Estimation Benchmark (CEB) which contains the ground truth cardinalities for sub-plans of over 16K queries from 21 templates with up to 15 joins. We show that across different architectures and databases, a model trained with Flow-Loss improves the plan costs and query runtimes despite having worse estimation accuracy than a model trained with Q-Error. When the test set queries closely match the training queries, models trained with both loss functions perform well. However, the Q-Error-trained model degrades significantly when evaluated on slightly different queries (e.g., similar but unseen query templates), while the Flow-Loss-trained model generalizes better to such situations, achieving 4–8× better 99th percentile runtimes on unseen templates with the same model architecture and training data.
Full Text Available